Agent Modelling in Partially Observable Domains
Abstract
Monitoring selectivity is a key challenge faced by agents when modelling other agents: an agent cannot continually monitor others, because such monitoring and modelling is computationally expensive, yet failing to monitor and model them increases its uncertainty about their state. Monitoring selectivity is also crucially important when agents plan in the presence of action and observation uncertainty. Formally, this paper focuses on an agent that uses a POMDP to plan its activities in a multiagent setting, and illustrates the critical nature of the monitoring selectivity challenge in POMDPs. The paper presents heuristics that limit the amount of monitoring and modelling of other agents, exploiting the reward structure and transition probabilities to automatically determine where to curtail such monitoring and modelling. We concretely illustrate our techniques in the domain of software personal assistants, and present some initial experimental results illustrating the efficiency...
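The abstract gives no pseudocode, so the following Python sketch is our own illustration, not the paper's algorithm; all names, the toy models, and the threshold rule are assumptions. It pairs a standard POMDP belief update with a crude value-of-information test that consults the reward structure, in the spirit of the heuristics the abstract describes, to decide whether monitoring another agent is worth its cost.

```python
# A minimal sketch of selective monitoring in a POMDP (illustrative only):
# the agent keeps a belief over another agent's state and triggers a costly
# "monitor" observation only when uncertainty could change which action is best.
import numpy as np

def belief_update(belief, T, O, action, obs):
    """Standard POMDP belief update: b'(s') ∝ O[s', obs] · Σ_s T[s, action, s'] b(s)."""
    predicted = belief @ T[:, action, :]        # predict next-state distribution
    updated = predicted * O[:, obs]             # weight by observation likelihood
    return updated / updated.sum()

def should_monitor(belief, R, monitor_cost):
    """Monitor only if the reward at stake under the current belief exceeds
    the cost of monitoring: a rough value-of-information proxy."""
    expected_rewards = belief @ R               # expected reward per action
    regret_bound = expected_rewards.max() - expected_rewards.min()
    return regret_bound > monitor_cost

# Toy usage: 3 states of the other agent, 2 actions, 2 observations.
T = np.full((3, 2, 3), 1.0 / 3.0)                    # transition probabilities
O = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])   # observation model
R = np.array([[1.0, 0.0], [0.2, 0.2], [0.0, 1.0]])   # reward per (state, action)
b = np.array([0.6, 0.3, 0.1])

if should_monitor(b, R, monitor_cost=0.4):
    b = belief_update(b, T, O, action=0, obs=1)
```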
Similar papers
Learning Policies with External Memory
In order for an agent to perform well in partially observable domains, it is usually necessary for actions to depend on the history of observations. In this paper, we explore a stigmergic approach, in which the agent’s actions include the ability to set and clear bits in an external memory, and the external memory is included as part of the input to the agent. In this case, we need to learn a r...
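The snippet above describes the stigmergic mechanism only in prose. As a hedged illustration (all names are hypothetical; this is not the paper's code), the idea can be expressed as a thin wrapper in which memory-bit actions are intercepted and the bits are appended to the observation the policy conditions on:

```python
# A minimal sketch of the stigmergic external-memory idea (assumptions only):
# the action set is extended with set/clear operations on external memory bits,
# and those bits are appended to the observation the policy sees.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ExternalMemoryWrapper:
    n_bits: int
    bits: List[int] = field(default_factory=list)

    def __post_init__(self):
        self.bits = [0] * self.n_bits

    def augment(self, env_obs: Tuple[int, ...]) -> Tuple[int, ...]:
        # The policy conditions on (environment observation, memory bits).
        return env_obs + tuple(self.bits)

    def apply(self, action: str) -> bool:
        # Memory actions are handled here; other actions go to the environment.
        if action.startswith("set_"):
            self.bits[int(action[4:])] = 1
            return True
        if action.startswith("clear_"):
            self.bits[int(action[6:])] = 0
            return True
        return False  # not a memory action

mem = ExternalMemoryWrapper(n_bits=2)
mem.apply("set_0")
print(mem.augment((3, 1)))   # -> (3, 1, 1, 0)
```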
Robot Navigation in Partially Observable Domains using Hierarchical Memory-Based Reinforcement Learning
In this paper, we attempt to find a solution to the problem of robot navigation in a domain with partial observability. The domain is a grid-world with intersecting corridors, where the agent learns an optimal policy for navigation by making use of a hierarchical memory-based learning algorithm. We define a hierarchy of levels over which the agent abstracts the learning process, as well as it...
Risk-Sensitive Planning in Partially Observable Environments
The Partially Observable Markov Decision Process (POMDP) is a popular framework for planning under uncertainty in partially observable domains. Yet, the POMDP model is risk-neutral in that it assumes that the agent is maximizing the expected reward of its actions. In contrast, in domains like financial planning, it is often required that the agent's decisions be risk-sensitive (maximize the utility o...
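The snippet is truncated, but one common way to make an agent risk-sensitive (a standard exponential-utility formulation, not necessarily this paper's) is to maximize the expected exponential utility of return rather than its expectation. A sketch comparing the two objectives on sampled returns:

```python
# Risk-neutral vs. risk-sensitive (exponential utility) objectives, as an
# illustration of the distinction drawn in the snippet above. All numbers
# are made up; this is not the paper's model.
import numpy as np

def risk_neutral_value(returns):
    return np.mean(returns)                      # E[R]

def risk_sensitive_value(returns, gamma):
    # Exponential utility: (1/gamma) * log E[exp(gamma * R)].
    # gamma < 0 is risk-averse: bad outcomes are penalized more heavily.
    return np.log(np.mean(np.exp(gamma * np.asarray(returns)))) / gamma

safe  = [1.0, 1.0, 1.0, 1.0]     # low-variance policy returns (mean 1.0)
risky = [5.0, 5.0, -2.0, -2.0]   # higher mean (1.5) but high variance

print(risk_neutral_value(risky) > risk_neutral_value(safe))                # True
print(risk_sensitive_value(risky, -1.0) < risk_sensitive_value(safe, -1.0))  # True
```

A risk-neutral planner prefers the risky policy for its higher mean, while a risk-averse one (gamma < 0) prefers the safe policy, which is exactly the behaviour gap the snippet motivates.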
Improving Uncoordinated Collaboration in Partially Observable Domains with Imperfect Simultaneous Action Communication
Decentralised planning in partially observable multi-agent domains is limited by the interacting agents’ incomplete knowledge of their peers, which impacts their ability to work jointly towards a common goal. In this context, communication is often used as a means of observation exchange, which helps each agent in reducing uncertainty and acquiring a more centralised view of the world. However,...